9 research outputs found

    DRPT: Disentangled and Recurrent Prompt Tuning for Compositional Zero-Shot Learning

    Compositional Zero-shot Learning (CZSL) aims to recognize novel concepts composed of known knowledge without training samples. Standard CZSL either identifies visual primitives or enhances unseen composed entities, and as a result, entanglement between state and object primitives cannot be fully utilized. Admittedly, vision-language models (VLMs) could naturally cope with CZSL through tuning prompts, while uneven entanglement leads prompts to be dragged into a local optimum. In this paper, we take a further step to introduce a novel Disentangled and Recurrent Prompt Tuning framework termed DRPT to better tap the potential of VLMs in CZSL. Specifically, the state and object primitives are deemed as learnable tokens of vocabulary embedded in prompts and tuned on seen compositions. Instead of jointly tuning state and object, we devise a disentangled and recurrent tuning strategy to suppress the traction force caused by entanglement and gradually optimize the token parameters, leading to a better prompt space. Notably, we develop a progressive fine-tuning procedure that allows for incremental updates to the prompts, optimizing the object first, then the state, and vice versa. Meanwhile, the optimization of state and object is independent, thus clearer features can be learned to further alleviate the issue of entangling misleading optimization. Moreover, we quantify and analyze the entanglement in CZSL and supplement entanglement rebalancing optimization schemes. DRPT surpasses representative state-of-the-art methods on extensive benchmark datasets, demonstrating superiority in both accuracy and efficiency.

    Dynamic Object Tracking for Quadruped Manipulator with Spherical Image-Based Approach

    Accurately estimating and tracking the motion of surrounding dynamic objects is one of the important tasks for the autonomy of a quadruped manipulator. However, with only an onboard RGB camera, it is still challenging for a quadruped manipulator to track a dynamic object moving with unknown and changing velocities. To address this problem, this manuscript proposes a novel image-based visual servoing (IBVS) approach consisting of three elements: a spherical projection model, a robust super-twisting observer, and a model predictive controller (MPC). The spherical projection model decouples the visual error of the dynamic target into linear and angular components. Then, given the visual error, the robustness of the observer is exploited to estimate the unknown and changing velocities of the dynamic target without depth estimation. Finally, the estimated velocity is fed into the MPC to generate joint torques for the quadruped manipulator to track the motion of the dynamic target. The proposed approach is validated through hardware experiments, and the experimental results illustrate the approach's effectiveness in improving the autonomy of the quadruped manipulator.

    Combating Data Imbalances in Federated Semi-supervised Learning with Dual Regulators

    Federated learning has become a popular method to learn from decentralized heterogeneous data. Federated semi-supervised learning (FSSL) emerges to train models from a small fraction of labeled data due to label scarcity on decentralized clients. Existing FSSL methods assume independent and identically distributed (IID) labeled data across clients and consistent class distribution between labeled and unlabeled data within a client. This work studies a more practical and challenging scenario of FSSL, where data distribution is different not only across clients but also within a client between labeled and unlabeled data. To address this challenge, we propose a novel FSSL framework with dual regulators, FedDure. FedDure lifts the previous assumption with a coarse-grained regulator (C-reg) and a fine-grained regulator (F-reg): C-reg regularizes the updating of the local model by tracking the learning effect on labeled data distribution; F-reg learns an adaptive weighting scheme tailored for unlabeled instances in each client. We further formulate the client model training as bi-level optimization that adaptively optimizes the model in the client with two regulators. Theoretically, we show the convergence guarantee of the dual regulators. Empirically, we demonstrate that FedDure is superior to the existing methods across a wide range of settings, notably by more than 11% on CIFAR-10 and CINIC-10 datasets.

    Mobile game for active learning

    Due to the interest of the Teaching, Learning and Pedagogy Division (TLPD) of NTU, this project aims to develop a social mobile game which combines learning and teaching. Students are able to gain knowledge through gaming, and teachers can access and analyze students’ answers accordingly. With Unity3D as the game engine, an Android game was developed to meet the requirements. As a result, this project can act as a framework for educational games, and the same approach to developing large software projects can be adapted to other development environments. Bachelor of Engineering (Computer Engineering)

    An Offline Weighted-Bagging Data-Driven Evolutionary Algorithm with Data Generation Based on Clustering

    In recent years, a variety of data-driven evolutionary algorithms (DDEAs) have been proposed to solve time-consuming and computationally intensive optimization problems. DDEAs are usually divided into offline DDEAs and online DDEAs, with offline DDEAs being the most widely studied and proven to display excellent performance. However, most offline DDEAs suffer from three disadvantages. First, they require many surrogates to build a relatively accurate model, which is a process that is redundant and time-consuming. Second, when the available fitness evaluations are insufficient, their performance tends to be not entirely satisfactory. Finally, to cope with the second problem, many algorithms use data generation methods, which significantly increases the algorithm runtime. To overcome these problems, we propose a brand-new DDEA with radial basis function networks as its surrogates. First, we invented a fast data generation algorithm based on clustering to enlarge the dataset and reduce fitting errors. Then, we trained radial basis function networks and carried out adaptive design for their parameters. We then aggregated radial basis function networks using a unique model management framework and demonstrated its accuracy and stability. Finally, fitness evaluations were obtained and used for optimization. Through numerical experiments and comparisons with other algorithms, this algorithm has been proven to be an excellent DDEA that suits data optimization problems.